An Empirical Comparison of Three Boosting Algorithms on Real Data Sets with Artificial Class

Authors

  • Ross A. McDonald
  • David J. Hand
  • Idris A. Eckley
Abstract

Boosting algorithms are a means of building a strong ensemble classifier by aggregating a sequence of weak hypotheses. In this paper we consider three of the best-known boosting algorithms: Adaboost [8], Logitboost [10] and Brownboost [7]. These algorithms are adaptive, and work by maintaining a set of example and class weights which focus the attention of a base learner on the examples that are hardest to classify. We conduct an empirical study to compare the performance of these algorithms, measured in terms of overall test error rate, on five real data sets. The tests consist of a series of cross-validatory samples. At each validation, we set aside one third of the data chosen at random as a test set, and fit the boosting algorithm to the remaining two thirds, using binary stumps as a base learner. At each stage we record the final training and test error rates, and report the average errors within a 95% confidence interval. We then add artificial class noise to our data sets by randomly reassigning 20% of class labels, and repeat our experiment. We find that Brownboost proves the least likely to overfit in this circumstance, because the algorithm incorporates an extra parameter which allows it to compensate for noisy examples.
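The protocol described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it implements only discrete Adaboost over binary stumps (Logitboost and Brownboost are not shown), and it assumes two-class labels coded as -1/+1, that "reassigning" a label means flipping it, and a normal-approximation 95% interval. All function names, the number of boosting rounds and the number of repeats are hypothetical choices.

```python
import math
import random

def stump_predict(row, j, t, s):
    # Binary stump: predict s when feature j is <= t, otherwise -s.
    return s if row[j] <= t else -s

def fit_stump(X, y, w):
    """Exhaustively search for the stump with the lowest weighted error."""
    best = (float("inf"), 0, 0.0, 1)
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for s in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(row, j, t, s) != yi)
                if err < best[0]:
                    best = (err, j, t, s)
    return best

def adaboost(X, y, rounds=20):
    """Discrete Adaboost for labels in {-1, +1}; returns [(alpha, j, t, s)]."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, j, t, s = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, t, s))
        # Up-weight misclassified examples, down-weight the rest, renormalise.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, j, t, s))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, row):
    score = sum(a * stump_predict(row, j, t, s) for a, j, t, s in model)
    return 1 if score >= 0 else -1

def error_rate(model, X, y):
    return sum(predict(model, r) != yi for r, yi in zip(X, y)) / len(y)

def add_class_noise(y, rate, rng):
    """Assumption: flip the labels of a random `rate` fraction of examples."""
    flipped = set(rng.sample(range(len(y)), round(rate * len(y))))
    return [-yi if i in flipped else yi for i, yi in enumerate(y)]

def repeated_holdout(X, y, repeats=10, seed=0):
    """Mean test error and 95% half-width over random 2/3 : 1/3 splits."""
    rng = random.Random(seed)
    errors = []
    for _ in range(repeats):
        idx = list(range(len(y)))
        rng.shuffle(idx)
        cut = len(idx) // 3  # one third held out as the test set
        test, train = idx[:cut], idx[cut:]
        model = adaboost([X[i] for i in train], [y[i] for i in train])
        errors.append(error_rate(model, [X[i] for i in test],
                                 [y[i] for i in test]))
    mean = sum(errors) / repeats
    sd = math.sqrt(sum((e - mean) ** 2 for e in errors) / (repeats - 1))
    return mean, 1.96 * sd / math.sqrt(repeats)
```

Under these assumptions, calling `repeated_holdout(X, y)` and then `repeated_holdout(X, add_class_noise(y, 0.20, random.Random(1)))` on the same data gives the clean and noisy average test errors with their interval half-widths, which is the kind of comparison the abstract reports.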


Similar resources

Comparison of different empirical methods for estimating daily reference evapotranspiration in the humid cold climate (case study: Borujen, Shahrekord, Koohrang and Lordegan)

The proposed method for calculating potential evapotranspiration is the FAO Penman-Monteith method, but there are other methods that require less meteorological data yet give estimates close to the FAO Penman-Monteith method in different climatic conditions. Performance evaluation of these methods on the same basis is a prerequisite for selecting an alternative approach in accordance with available da...


Which Methodology is Better for Combining Linear and Nonlinear Models for Time Series Forecasting?

Both theoretical and empirical findings have suggested that combining different models can be an effective way to improve the predictive performance of each individual model. This is especially the case when the models in the ensemble are quite different. Hybrid techniques that decompose a time series into its linear and nonlinear components are one of the most important kinds of the hybrid model...


On the use of Heronian means in a similarity classifier

This paper introduces new similarity classifiers using the Heronian mean, and the generalized Heronian mean operators. We examine the use of these operators at the aggregation step within the similarity classifier. The similarity classifier was earlier studied with other operators, in particular with an arithmetic mean, generalized mean, OWA operators, and many more. The two classifiers here ar...


Numerical Simulation of Free Surface Flows and Comparison of Symmetry and Real Boundary Conditions at the Free Surface

For implementation of the free surface boundary condition, a new subroutine has been introduced to an existing steady 3-D body fitted code. This code was previously written for steady flow simulation in closed ducts. The algorithm used in this subroutine reduces the instability problem according to the free surface wave generation. For code validation, it was applied to two different open c...





Publication date: 2003